PNAS Nexus — Latest Matching Preprints

1

Deep Learning Reveals the Modular Genetic Architecture of Cardiovascular Aging

Choi, R. B.; Croon, P. M.; Perera, S.; Oikonomou, E.; Khera, R.

2026-04-24 cardiovascular medicine 10.64898/2026.04.22.26351478 medRxiv

Top 0.1%

3.9%

Chronological age is a potent determinant of clinical events, but it is conventionally treated as a linear function of time rather than a dynamic process shaped by genetics and tissue-specific senescence. Deep learning models derived from cardiovascular imaging offer an opportunity to quantify biological age across multiple domains and to examine the extent to which these measures capture shared or distinct vulnerabilities. Here, we applied deep learning to estimate biological age from electrocardiograms, cardiac MRI, carotid ultrasound, and retinal imaging, capturing electrical, structural, macrovascular, and microvascular domains in more than 100,000 UK Biobank participants. Genome-wide association and cross-trait heritability analyses showed that cardiovascular aging is not a singular process but a modular phenotype with distinct genetic determinants across modalities. Polygenic risk scores supported these distinct trajectories, showing that different biological age measures capture partly divergent biological processes with corresponding differences in clinical associations. Modality-specific genes also showcased distinct cell-type enrichment patterns. By deconvoluting aging into electrical, structural, macrovascular, and microvascular components, our results demonstrate that AI-derived age metrics capture distinct, disease-specific aging pathways. Ultimately, this modular framework positions deep learning-derived aging models not as holistic measures of health, but as domain-specific biomarkers of cardiovascular vulnerability.

2

Modeling the impact of adherence to U.S. isolation and masking guidance on SARS-CoV-2 transmission in office workplaces in 2021-2022

Garcia Quesada, M.; Wallrafen-Sam, K.; Kiti, M. C.; Ahmed, F.; Aguolu, O. G.; Ahmed, N.; Omer, S. B.; Lopman, B. A.; Jenness, S. M.

2026-04-21 epidemiology 10.64898/2026.04.14.26350639 medRxiv

Top 0.1%

2.0%

Non-pharmaceutical interventions (NPIs) have been important for controlling SARS-CoV-2 transmission, particularly before and during initial vaccine rollout. During the pandemic, the US Centers for Disease Control and Prevention issued isolation and masking guidance in case of COVID-19-like illness, a positive SARS-CoV-2 test, or known exposure to SARS-CoV-2. However, the impact of this guidance on mitigating transmission in office workplaces is unclear. We used a network-based mathematical model to estimate the impact of this guidance on SARS-CoV-2 transmission among office workers and their communities. The model represented social contacts in the home, office, and community. We used data from the CorporateMix study to parametrize social contacts among office workers and calibrated the model to represent the COVID-19 epidemic in Georgia, USA from January 2021 through August 2022. In the reference scenario (58% adherence to guidance among office workers and the broader population), workplace transmission accounted for a small fraction of total infections. Reducing adherence among office workers to 0% increased workplace transmissions by 27.1% and increasing adherence to 75% reduced workplace transmission by 7.0%. Increasing adherence to 75% among office workers had minimal impact on symptomatic cases and deaths; increasing it among the broader population was more effective in reducing office worker cases and deaths. In our model, moderate adherence to recommended NPIs in workplaces was effective in reducing transmission, but increasing adherence had limited benefit given workplaces that have low contact intensity and hybrid work arrangements. These results underscore the public health benefits of community-wide adoption of recommended NPIs.

3

A Predictive Model for Coupling Cell Division Orientation to Tissue Mechanics During Epithelial Morphogenesis

AZOTE epse HASSIKPEZI, S.; Negi, R. S.; Chen, N.; Manning, M. L.

2026-04-21 biophysics 10.64898/2026.04.17.719304 medRxiv

Top 0.2%

1.9%

Stratified epithelial tissues such as the skin epidermis maintain barrier integrity during development and homeostasis through the coordinated action of cell proliferation, differentiation, delamination, and tissue-scale mechanical forces. During development, the orientation of cell division within the basal layer plays a pivotal role in tissue stratification; however, the mechanical principles linking the orientation of the division plane to these processes across developmental stages remain poorly understood. Here, we expand a recently developed three-dimensional vertex model for stratified epithelia, composed of the basement membrane, basal, and suprabasal layers, to study the mechanical and structural impact of cell divisions with a wider range of orientations. The model integrates developmental stage via specific changes in heterotypic interfacial tensions (arising from actomyosin cortical contractility and adhesion molecules at the basal-suprabasal interface) and tissue stiffness that have been quantified previously in experiments. By systematically varying background mechanical parameters, we investigate how heterotypic tension, division orientation, and tissue fluidity collectively influence the outcome of cell division. Our goal is to uncover the strategies that the embryo may employ to generate stratified phenotypes at different developmental stages, recognizing that these strategies might evolve over time. Although our focus is on the embryonic developmental stages of the epidermis, this framework may also be extended to investigate transformed cells, such as in cancer, to explore how altered division orientation contributes to precancerous or transformed phenotypes.

4

Diagnostic Delays Drive Transmission in Dense Cities: Modeling the Waiting-Window Effect and Its Mitigation

Bahig, S.; Oughton, M.; Vandesompele, J.; Brukner, I.

2026-04-22 epidemiology 10.64898/2026.04.20.26350946 medRxiv

Top 0.2%

1.7%

In dense urban settings, delays between diagnostic sampling and effective isolation can sustain transmission during peak infectiousness. We define a waiting-window transmission externality arising when infectious individuals remain mobile while awaiting results, formalized as E = N·P·TR·D, where N is daily testing volume, P test positivity, TR transmission during the waiting period, and D turnaround time. Using Monte Carlo simulation and a susceptible-infectious-recovered (SIR) framework, we quantify excess infections per 1,000 tests/day under multiple diagnostic workflows. A surge scenario incorporates positive coupling between TR and D (ρ = 0.45), reflecting co-occurrence of laboratory saturation and elevated contacts during system stress. Under centralized 48-hour workflows, excess infections reach ~80 at P = 10% and ~401 at P = 50%, increasing to ~628 under surge conditions. In contrast, near-patient rapid testing and home sampling reduce this to ~5 and ~25-26, respectively. Workflows that eliminate the waiting window, either through immediate isolation at sampling or through home-based PCR that returns results at the point of collection, effectively collapse the transmission term. These findings identify diagnostic delay as a modifiable driver of epidemic dynamics. Operational redesign of testing workflows, including decentralized sampling and home-based molecular diagnostics, offers a scalable pathway to improve epidemic controllability and reduce inequities in dense urban environments.
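The product formula for the waiting-window externality can be checked with simple arithmetic. The sketch below is illustrative only and is not the authors' Monte Carlo or SIR model. The function name is hypothetical, D is assumed to be measured in days, and the value TR = 0.4 is an assumption back-calculated from the reported figure of roughly 80 excess infections at P = 10% under a 48-hour workflow; the abstract does not state TR.

```python
def excess_infections(tests_per_day, positivity, tr_per_day, delay_days):
    """Point estimate of E = N * P * TR * D.

    This is a deterministic product only: it ignores the Monte Carlo
    sampling and the surge coupling between TR and D described in the abstract.
    """
    return tests_per_day * positivity * tr_per_day * delay_days


# ASSUMPTION: TR is not given in the abstract. 0.4 is back-calculated from
# the reported ~80 excess infections at N = 1000, P = 10%, D = 2 days.
ASSUMED_TR = 0.4

# Centralized 48-hour workflow at low and high test positivity.
centralized_p10 = excess_infections(1000, 0.10, ASSUMED_TR, 2.0)
centralized_p50 = excess_infections(1000, 0.50, ASSUMED_TR, 2.0)

# Eliminating the waiting window (D = 0) collapses the product to zero.
no_waiting_window = excess_infections(1000, 0.50, ASSUMED_TR, 0.0)
```

Under this assumed TR, the product gives about 80 at P = 10% and about 400 at P = 50%, in line with the reported ~80 and ~401. Because E is linear in D, shortening the turnaround time reduces the estimate proportionally.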

5

Decreasing peptide deformylase activity is a beneficial strategy for increasing formaldehyde resistance in Methylobacterium extorquens

Hellenbrand, C. N.; Miller, T. J.; Kemna, E. M.; Bruger, E. L.; Hying, Z. T.; Bazurto, J. V.

2026-04-21 physiology 10.64898/2026.04.16.718930 medRxiv

Top 0.7%

1.2%

Formaldehyde is a highly toxic metabolite that can cause extensive damage to DNA and proteins, and strategies to mitigate formaldehyde toxicity are poorly understood. Methylotrophic bacteria, such as Methylobacterium extorquens, thrive on one-carbon compounds as sole sources of carbon and energy. These organisms are excellent models for discovering formaldehyde stress response systems because formaldehyde is an obligate intermediate in their central carbon metabolism. Here, we characterize an evolved def allele (defevo) that increases formaldehyde resistance in M. extorquens. The def gene encodes peptide deformylase (PDF, EC:3.5.1.88), an enzyme that contributes to protein processing by removing the formyl group from N-formylmethionine (fMet) on nascent peptides. The defevo allele has a single missense mutation that decreases PDF activity both in vitro and in vivo. Transcriptomic analysis of the defevo strain indicates there are pleiotropic effects of this mutation and a differential response to formaldehyde stress. We investigate possible mechanisms for the defevo mutant's increased resistance to formaldehyde, including mitigation of formaldehyde-induced protein stress and altered membrane physiology. We find that the defevo allele selectively alleviates exogenous, but not endogenous, formaldehyde stress and identify a tradeoff in heat shock resistance. This study reports the first observation of lowered PDF activity benefiting a cellular physiological phenotype. Our work indicates that altered protein metabolism can mitigate the toxic effects of formaldehyde and furthers our understanding of the strategies that can protect cells from formaldehyde-induced damage.
Importance: Formaldehyde is a toxic chemical that can damage essential molecules inside of cells, yet all organisms inevitably produce it during normal metabolism. Despite its ubiquity, our understanding of strategies for how cells navigate formaldehyde toxicity is incomplete. This study focuses on Methylobacterium extorquens, which naturally generates high levels of formaldehyde as part of its growth on simple carbon compounds. We show herein that a single genetic change, which slows down how newly made proteins are processed during translation, can unexpectedly improve the bacterium's ability to resist formaldehyde stress. Further, we show that this single change has numerous effects on the cell, many of which may contribute to formaldehyde resistance.

6

Positional information and information flows in dynamic tissues

Plum, A. M.; Serra, M.

2026-04-20 biophysics 10.64898/2026.04.15.718553 medRxiv

Top 1%

0.9%

During development, embryos store, transmit, and transform information to generate spatial patterns. Positional information (PI) quantifies how precisely cells form patterns at a given time, but cell motion has limited its application to static tissues. We introduce a framework for PI in dynamic tissues by decomposing mutual information between cells' positions and properties over time into information flows contributing to PI preservation, loss and generation. These reveal information-theoretic signatures of ubiquitous developmental processes, including instruction, sorting and mixing, directly from data. Applying this framework to whole-embryo cell trajectories in Drosophila, mouse and zebrafish gastrulation, we provide local and global information-theoretic quantification of cell mixing and derive bounds on PI preservation imposed by tissue dynamics. Analyzing tissue flows as dynamical systems, we further show that morphogenesis structures mixing, preferentially preserving specific patterns. Finally, we derive inequality conditions for tracing generated PI to candidate information sources and distinguishing among alternative pattern-formation mechanisms, from programmed extracellular cues to self-organizing intercellular interactions.

7

The effect of long-term management on wild pig (Sus scrofa x domesticus) populations across the southeastern United States

Foster, J. R.; Pepin, K.; Miller, R.

2026-04-21 ecology 10.64898/2026.04.16.719012 medRxiv

Top 1%

0.9%

1. The management of invasive species often emphasizes removals to manage populations. However, evaluating the success of this management technique remains challenging, especially at large scales. Understanding the relationship between removal intensity and population growth is essential for determining when management achieves desired outcomes.
2. We used management removal data (removal resources [e.g. trapping] and relative effort [trap nights]) to estimate population density, demographic structure, and growth rates of invasive wild pigs (Sus scrofa x domesticus) across a large landscape. From the management data and population estimates, we inferred population trajectories in the absence of removals and quantified the proportion of the population removed by the most widely used methods to control wild pigs. We then compared observed removal intensities and population growth rates to predict expected population trajectories immediately after management occurs.
3. Results suggest substantial spatial and temporal variation in wild pig growth rates and variation in the effectiveness of removal efforts. Additionally, removing wild pigs at higher densities had a greater effect on limiting population growth than removals conducted at lower densities, though both are important. However, on large properties, removal intensity was often insufficient to offset population growth, indicating that management effort does not scale to large areas.
4. These results demonstrate how removal data and population modeling can provide robust inference on population dynamics and management effectiveness, offering a scalable framework for evaluating and improving invasive species control programs. We also discuss the current limitation of how effort is defined for different large-mammal removal techniques, and offer potential solutions for a more complete definition, such as going beyond trap nights and including constraints on personnel, equipment, and logistics.

8

Local gated-Hebbian learning of deep cerebellar networks with quadratic classification capacity

Hiratani, N.

2026-04-20 neuroscience 10.64898/2026.04.17.718957 medRxiv

Top 1%

0.9%

A central goal of neuroscience is to understand how neural circuit architecture supports learning. While recent work has clarified the computational role of depth in sensory cortical hierarchies, it remains unclear why predominantly feedforward, non-convolutional circuits such as the cerebellum and olfactory system also contain multiple processing layers. Theoretical work in deep learning has shown that two-hidden-layer networks can achieve classification capacity that scales quadratically with the number of intermediate neurons, but these results rely on nonlocal synaptic optimization and are therefore difficult to reconcile with biological learning rules. Here, we show analytically and numerically that a two-hidden-layer network with feedforward gating can achieve quadratic capacity using local three-factor Hebbian learning when intermediate activity is sparse. This architecture supports efficient one-shot learning and, in settings where backpropagation requires many repeated weight updates, offers an advantage in learning speed. Beyond random perceptron tasks, the model also performs well on structured cerebellum-related tasks, including reinforcement-learning-based motor control. Mapping the model onto cerebellar microcircuitry further suggests functional roles for dendritic compartmentalization, branch-specific inhibition, and disinhibitory interneuron pathways. Together, these results extend the Marr-Albus-Ito framework by showing how the presence of multiple intermediate layers in cerebellum-like circuits can support fast, local, and high-capacity learning.

9

Vaginal metabolome signatures of high-risk HPV infection trajectories in HIV-negative premenopausal women

Adebamowo, C.; Adebamowo, S. N. N.; Gbolahan, T.; Ikwueme, O.; Famooto, A.; Owoade, Y.; ACCME Research Group as part of H3Africa Consortium,

2026-04-22 epidemiology 10.64898/2026.04.21.26351401 medRxiv

Top 1%

0.8%

Persistent detection of high-risk human papillomavirus (HPV) is required for cervical carcinogenesis, yet the metabolic phenotype associated with distinct HPV transition states remains incompletely defined. We analyzed vaginal metabolomics data from 71 HIV-negative, non-smoking, premenopausal women without other sexually transmitted infections, grouped by three-visit HPV trajectories: persistent negative (NNN, n=20), late incident positivity (NNP, n=9), conversion with persistence (NPP, n=13), clearance after prior positivity (PPN, n=16), and persistent positive (PPP, n=13). After detection-based filtering, 186 putative and 64 quantitatively estimated metabolites were retained for integrated univariate, multivariate, network, pathway, and machine learning analyses. Global class separation was weak by PERMANOVA and by five-class classification, indicating that the vaginal metabolome does not reorganize broadly across all HPV states. In contrast, trajectory-specific signals were reproducible. The strongest pairwise contrast was NNP versus PPP (best cross-validated ROC AUC 0.778; permutation p=0.039). Glycolic acid was the dominant single metabolite, particularly for NNP versus PPP (Mann-Whitney p=6.96x10^-4, FDR=0.0446, AUROC=0.902; detection 88.9% versus 15.4%; combined abundance+detection FDR=0.0010). Persistent positivity was characterized by a focused uracil-high, methyl-donor/redox-low signature, including lower glycolic acid, S-adenosylmethionine, NAD+, and betaine, together with higher uracil. Ratio mining further sharpened discrimination, with uracil/S-adenosylmethionine and uracil/creatinine among the best PPP classifiers, and glucose 1-phosphate/isovaleric acid-valeric acid strongly separating NNP from NPP. These data support a model in which HPV trajectory is encoded by targeted metabolic states rather than a diffuse HPV-positive versus HPV-negative metabolomic shift.

10

The Immunoglobulin G Glycome: A Modifiable Biomarker and Functional Effector of Aging, Disease, and Mortality

Mijakovac, A.; Butz, E.; Vuckovic, F.; Frkatovic Hodzic, A.; Rapcan, B.; Kifer, D.; Deris, H.; Radovani Trbojevic, B.; Luksic, F.; Cindric, A.; Gudelj, I.; simunic Briski, N.; Josipovic, G.; Stara Yuksel, Z.; catic, J.; saler, F.; Szavits-Nossan, J.; Hedin, C. R. H.; simunovic, J.; Borosak, I.; Kristic, J.; Monteiro-Martins, S.; Pribic, T.; Hanic, M.; Pucic-Bakovic, M.; Trbojevic-Akmacic, I.; stambuk, T.; stambuk, J.; Martinic Kavur, M.; Fancovic, M.; Cvetko, A.; Pezer, M.; Polasek, O.; Gornik, O.; Kiprov, D.; Verdin, E.; Younggren, B.; Newson, L.; Menni, C.; Steves, C. J.; Spector, T. D.; Hal

2026-04-23 epidemiology 10.64898/2026.04.21.26351390 medRxiv

Top 1%

0.8%

Glycosylation is a key structural modification of immunoglobulin G (IgG) that modulates its effector functions and has multiple roles in balancing inflammation. Altered IgG glycosylation has been reported in many diseases, often years before clinical manifestation, suggesting its causal role and biomarker potential. Here, we analyzed IgG glycome composition in 20,405 individuals from 42 different studies processed at the Genos Glycoscience Research Laboratory between 2008 and 2025. Across nearly all diseases, specific IgG glycome profiles reflected accelerated biological aging. Accelerated glycan aging was strongly associated with increased risk of all-cause mortality, independent of established clinical risk factors and potential confounders. Moreover, interventions known to reduce mortality risk, including hormone replacement therapy, therapeutic plasma exchange and caloric restriction, were associated with reversal of glycan aging. Given their role in modulating low-grade systemic inflammation, IgG glycans may represent a functional link between chronic inflammation, aging, disease susceptibility and all-cause mortality.

11

Fentanyl Purity and Overdose Decline: A Reexamination of Geographic Trends

Dasgupta, N.; Sibley, A. L.; Gildner, P.; Gora Combs, K.; Post, L. A.; Tobias, S.; Kral, A. H.; Pacula, R. L.

2026-04-24 epidemiology 10.64898/2026.04.23.26351605 medRxiv

Top 1%

0.8%

Drug overdose deaths in the United States reached record levels during the fentanyl era before recently declining. A plausible hypothesis is that a sudden drop in fentanyl purity beginning in 2023 caused the downturn in overdose mortality. We evaluated this hypothesis by replicating a published analysis with regional overdose data, using models that account for time trends and autocorrelation, and negative control indicators to test for spurious correlation. When fentanyl purity was rising, the national purity series did not track overdose increases in most regions and showed only a modest association in the West. When both purity and mortality later declined, the observed associations were also seen with unrelated macroeconomic indicators that shared the same time pattern. National fentanyl purity alone does not provide a sufficient explanation for recent overdose declines.

12

Interpretability as stability under perturbation reveals systematic inconsistencies in feature attribution

Piorkowska, N. J.; Olejnik, A.; Ostromecki, A.; Kuliczkowski, W.; Mysiak, A.; Bil-Lula, I.

2026-04-22 health informatics 10.64898/2026.04.20.26351354 medRxiv

Top 2%

0.7%

Interpreting machine learning models typically relies on feature attribution methods that quantify the contribution of individual variables to model predictions. However, it remains unclear whether attribution magnitude reflects the true functional importance of features for model performance. Here, we present a unified interpretability framework integrating permutation-based attribution, feature ablation, and stability under perturbation across multiple feature spaces. Using nested cross-validation and permutation-based null diagnostics, we systematically evaluate the relationship between attribution magnitude and functional dependence in clinical and biomarker-based prediction models. Attribution magnitude is frequently misaligned with functional importance, with weak to strong negative correlations observed across feature spaces (Spearman ρ ranging from -0.374 to -0.917). Features with high attribution often have limited impact on model performance when removed, whereas features with low attribution can be essential for maintaining predictive accuracy. These discrepancies define distinct classes of interpretability failure, including attribution excess and latent dependence. Interpretability further depends on feature space composition, and stable, functionally relevant features are not necessarily those with the highest attribution scores. By integrating attribution, functional impact, and stability into a composite Feature Reliability Score, we identify features that remain informative across perturbations and analytical contexts. These findings indicate that interpretability does not arise from attribution magnitude alone but is better characterized by stability under perturbation. This framework provides a basis for more robust model interpretation and highlights limitations of attribution-centric approaches in high-dimensional and correlated data settings.

13

Particle lability drives degradation dynamics and bacterial community assembly during a Phaeocystis bloom decline

Romanelli, E.; Stevens-Green, R.; Cisternas-Novoa, C.; LaRoche, J.; Siegel, D. A.; Carlson, C. A.; Passow, U.

2026-04-20 microbiology 10.64898/2026.04.19.716305 medRxiv

Top 2%

0.7%

Microbial degradation of suspended and sinking organic carbon regulates long-term oceanic carbon storage by controlling the efficiency of the biological pump. Yet microbial controls on carbon export and remineralization remain poorly constrained, limiting predictions of how ocean carbon cycling will respond to climate change. Here, we combined in situ sampling with ship-based incubations to quantify prokaryote-driven removal rates of suspended and sinking total organic carbon (TOC). Samples were collected below the mixed layer during three stages of a spring Phaeocystis pouchetii bloom in the Labrador Sea. Phaeocystis blooms can dominate regional phytoplankton biomass and are expected to increase under future climate. Removal rates were used as a proxy for carbon lability and combined with 16S rRNA metabarcoding and carbon composition analyses to link microbial community structure with substrate characteristics. Removal rates of sinking particles (0.02-0.06 d⁻¹) were an order of magnitude higher than those of suspended TOC (0.002 d⁻¹) during bloom-decline and non-bloom. In contrast, during late-bloom, suspended carbon exhibited rates of 0.01 d⁻¹, comparable to sinking particles, and was enriched in exopolymer-rich colonies. Prokaryotic community composition varied primarily among bloom stages rather than carbon fractions, indicating that bloom stage (and thus particle origin and composition) was the dominant control on bacterial degradation and assembly. Bacterial diversity peaked where carbon was refractory and originated from mixed phytoplankton. Together, these results demonstrate that suspended Phaeocystis-derived carbon can be rapidly remineralized when blooms produce exopolymer-rich colonies and highlight bloom stage as a key regulator of microbial carbon processing and biological pump efficiency.
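To give a sense of scale for the per-day removal rates quoted above, the sketch below converts a rate into a half-life. It assumes simple first-order (exponential) decay, which the abstract does not state; the function names are hypothetical and the conversion is offered only as a rough reading aid, not as the authors' analysis.

```python
import math


def half_life_days(rate_per_day):
    """Half-life under ASSUMED first-order decay: t_half = ln(2) / k."""
    return math.log(2) / rate_per_day


def fraction_remaining(rate_per_day, days):
    """Fraction of carbon left after `days` under the same assumption."""
    return math.exp(-rate_per_day * days)


# Rates reported in the abstract (per day).
suspended_half_life = half_life_days(0.002)    # suspended TOC, bloom-decline
sinking_slow_half_life = half_life_days(0.02)  # sinking particles, lower bound
sinking_fast_half_life = half_life_days(0.06)  # sinking particles, upper bound
```

Under this assumption the suspended pool would take roughly a year to halve, while sinking particles would halve in about 12 to 35 days, which is the order-of-magnitude contrast the abstract describes.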

14

CT-Based Deep Foundation Model for Predicting Immune Checkpoint Inhibitor-Induced Pneumonitis Risk in Lung Cancer

Muneer, A.; Showkatian, E.; Kitsel, Y.; Saad, M. B.; Sujit, S. J.; Soto, F.; Shroff, G. S.; Faiz, S. A.; Ghanbar, M. I.; Ismail, S. M.; Vokes, N. I.; Cascone, T.; Le, X.; Zhang, J.; Byers, L. A.; Jaffray, D.; Chang, J. Y.; Liao, Z.; Naing, A.; Gibbons, D. L.; Vaporciyan, A. A.; Heymach, J. V.; Suresh, K. S.; Altan, M.; Sheshadri, A.; Wu, J.

2026-04-23 oncology 10.64898/2026.04.21.26351428 medRxiv

Top 2%

0.7%

Background: Immune checkpoint inhibitors (ICIs) have revolutionized cancer therapy but can cause serious immune-related adverse events (irAEs), with pneumonitis (ICI-P) being among the most severe. Early identification of high-risk patients before ICI initiation is critical for closer monitoring, timely intervention, and improved outcomes. Purpose: To develop and validate a deep learning foundation model to predict ICI-P from baseline CT scans in patients with lung cancer. Methods: We designed the Checkpoint-Inhibitor Pneumonitis Hazard EstimatoR (CIPHER), a deep learning foundation model that combines contrastive learning with a transformer-based masked autoencoder to predict ICI-P from baseline CT scans in patients with lung cancer. Using self-supervised learning, CIPHER was pre-trained on 590,284 CT slices from 2,500 non-small cell lung cancer (NSCLC) patients to capture heterogeneous lung parenchymal patterns. After pre-training, the model was fine-tuned on an internal NSCLC cohort for ICI-P risk prediction, using images from 254 patients for model development and 93 patients for internal validation. We compared CIPHER with classical radiomic models and further evaluated it on an external NSCLC cohort of 116 patients. Results: In the internal immunotherapy cohort, CIPHER consistently distinguished patients at elevated risk of ICI-P from those without the event, with AUCs ranging from 0.77 to 0.85. In head-to-head benchmarking, CIPHER achieved an AUC of 0.83, outperforming the radiomic models. In the external validation cohort, CIPHER maintained strong performance (AUC = 0.83; balanced accuracy = 81.7%), exceeding the radiomic models (DeLong p = 0.0318) and demonstrating higher specificity without sacrificing sensitivity. By contrast, the radiomic model showed high sensitivity (85.0%) but markedly lower specificity (45.8%). Confusion matrix analysis confirmed the robust classification performance of CIPHER, correctly identifying 80 of 96 non-ICI-P cases and 16 of 20 ICI-P cases. Conclusions: We developed and externally validated CIPHER for predicting future risk of ICI-P from pre-treatment CT scans. With prospective validation, CIPHER may be incorporated into routine patient management to improve outcomes.
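The external-cohort counts quoted in the abstract are enough to recompute sensitivity, specificity, and balanced accuracy by hand. The sketch below does that arithmetic; the function name is hypothetical, and the false-negative and false-positive counts (4 and 16) are derived by subtraction from the reported totals rather than stated directly.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, balanced accuracy) from counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    return sensitivity, specificity, balanced_accuracy


# Reported: 16 of 20 ICI-P cases and 80 of 96 non-ICI-P cases correctly
# identified. Derived by subtraction: 4 false negatives, 16 false positives.
sens, spec, bal_acc = binary_metrics(tp=16, fn=4, tn=80, fp=16)
```

These counts give a sensitivity of 80.0%, a specificity of about 83.3%, and a balanced accuracy of about 81.7%, which matches the balanced accuracy reported for the external validation cohort.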

15

Transcriptomic subtypes in high-grade serous ovarian cancer are driven by tumor cellular composition

Tanis, S.; Lixandrao, M.; Ivich, A.; Grieshober, L.; Lawson-Michod, K. A.; Collin, L. J.; Peres, L. C.; Salas, L. A.; Marks, J. R.; Bitler, B. G.; Greene, C. S.; Schildkraut, J. M.; Doherty, J. A.; Davidson, N. R.

2026-04-21 cancer biology 10.64898/2026.04.16.719000 medRxiv

Top 2%

0.7%

High-grade serous ovarian carcinoma (HGSC) is an aggressive malignancy for which bulk transcriptomic subtypes are used to stratify tumors, interpret biology, and guide biomarker development. The four TCGA-derived subtypes, mesenchymal (C1.MES), immunoreactive (C2.IMM), proliferative (C5.PRO), and differentiated (C4.DIF), are consistently observed across cohorts. However, despite their prominence, these subtypes have not translated into therapeutic utility, and their biological basis remains unresolved. Here, we show that HGSC transcriptomic subtypes are largely determined by tumor cellular composition rather than intrinsic malignant transcriptional programs. By integrating controlled single-cell-derived pseudobulk simulations with deconvolution-based analysis of 1,834 primary HGSC tumors across RNA-seq and microarray cohorts, we demonstrate that subtype probabilities align along a composition-driven axis of stromal and immune variation. Cellular composition alone predicted subtype labels with high accuracy (ROC-AUC = 0.81-0.95) and explained a substantial fraction of subtype-associated transcriptomic variation, with the mesenchymal (C1.MES) subtype representing the most robust and reproducible example of composition-driven signal. Although a secondary, composition-independent expression signal is detectable, it does not define the dominant structure of subtype classification. These findings redefine HGSC transcriptomic subtypes as features of the tumor ecosystem rather than discrete malignant states. This reinterpretation has immediate implications for studies that use subtype labels to infer tumor-intrinsic biology and provides a generalizable framework for separating composition-driven and intrinsic signals in bulk tumor data.
Significance Statement: HGSC transcriptomic subtypes lack consistent clinical utility and remain biologically ambiguous. We show subtype assignments are largely driven by tumor cellular composition, and less so by distinct intrinsic tumor states.

16

Metabolic feedback during bacterial fermentation is a motility brake

Le Nagard, L.; Schwarz-Linek, J.; Krasnopeeva, E.; Douarche, C.; Arlt, J.; Dawson, A.; Martinez, V.; Poon, W. C. K.; Pilizota, T.

2026-04-19 microbiology 10.64898/2026.04.18.717966 medRxiv

Top 2%

0.7%

We study an unexpectedly fast decay of motility in dense suspensions of Escherichia coli bacteria supplied with excess glucose under anaerobic conditions. The decrease in swimming speed occurs on a timescale inversely proportional to the cell concentration, and is associated with the secretion of organic acids by the bacteria. We show that the decay is driven by the progressive accumulation of non-ionised organic acids in the medium, and develop a chemical kinetic model that successfully predicts the swimming speed variations over a range of conditions in the presence of these acids. We further measure the internal pH of E. coli cells exposed to organic acids, and find that the speed decay coincides with sharp declines in internal pH and metabolic rate. Our findings identify an additional layer of motility control that can arise in complex environments even when motility genes are expressed and energy sources are abundant. This mechanism is likely relevant for understanding bacterial motility in habitats such as the human gut, where high densities of bacteria and organic acids are common.

17

Multimodal Integration of Ambulatory ECG and Clinical Features for Sudden Cardiac Death and Pump Failure Death Prediction

Swee, S.; Adam, I.; Zheng, E. Y.; Ji, E.; Wang, D.; Speier, W.; Hsu, J.; Chang, K.-W.; Shivkumar, K.; Ping, P.

2026-04-22 cardiovascular medicine 10.64898/2026.04.21.26351421 medRxiv

Top 3%

0.6%

Ambulatory electrocardiograms (ECGs) provide continuous monitoring of the heart's electrical activity. However, many existing machine learning and artificial intelligence models for analyzing ambulatory ECG traces are unimodal and do not incorporate patient clinical context. In this study, we propose a multimodal framework integrating ambulatory ECG-derived representations with clinical text embeddings to predict two cardiac outcomes: sudden cardiac death and pump failure death. Ambulatory ECG traces are preprocessed, segmented, and encoded via a multiple instance learning and temporal convolutional neural network framework. In parallel, patient clinical features are parsed into structured prompts, which are passed through a large language model to generate clinical reasoning; this reasoning passes through a biomedical language encoder to generate a text embedding. With the ECG and text embeddings, we systematically evaluate multiple fusion strategies, including concatenation- and gating-based approaches, to integrate these two data modalities. Our results demonstrate that multimodal models consistently outperform unimodal baselines, with adaptive fusion mechanisms providing the greatest improvements in predictive performance. Decision curve analysis highlights the potential clinical utility of the proposed framework for risk stratification. Finally, we visualize model attention across modalities, including ECG attention patterns, segment-level saliency, heart rate variability features, and clinical reasoning, to contextualize patient-specific predictions.
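
The abstract names concatenation- and gating-based fusion without giving details. As a rough illustration only, the sketch below contrasts the two ideas on plain Python lists. The function names, the per-dimension sigmoid gate, and the requirement that both embeddings share one dimension are assumptions made for this example, not the authors' architecture.

```python
import math


def concat_fusion(ecg, text):
    """Concatenation fusion: stack the two embeddings end to end."""
    return list(ecg) + list(text)


def gated_fusion(ecg, text, gate_weights, gate_bias):
    """Gated fusion (hypothetical form): g = sigmoid(W [ecg; text] + b),
    output = g * ecg + (1 - g) * text, computed per dimension.

    gate_weights has one row per output dimension, each of length 2 * d.
    """
    if len(ecg) != len(text):
        raise ValueError("this sketch assumes equal embedding sizes")
    joint = concat_fusion(ecg, text)
    fused = []
    for k in range(len(ecg)):
        z = gate_bias[k] + sum(w * v for w, v in zip(gate_weights[k], joint))
        g = 1.0 / (1.0 + math.exp(-z))  # gate value in (0, 1)
        fused.append(g * ecg[k] + (1.0 - g) * text[k])
    return fused
```

With all gate parameters at zero the gate is 0.5 everywhere, so the output is the plain average of the two embeddings; a trained gate would instead weight each modality per patient and per dimension.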

18

Temporal Dissociation of Syntactic Disambiguation and Memory Retrieval during Sentence Processing: Naturalistic MEG Evidence from Interpretable Models

Dunagan, D.; Low, D. S.; Yue, S.; Meyer, L.; Hale, J.

2026-04-21 neuroscience 10.64898/2026.04.20.719609 medRxiv

Top 3%

0.6%

Human sentence comprehension proceeds word-by-word, with prior research proposing two central sources of cognitive demand during incremental processing: forward-looking disambiguation of the incoming information stream, and backward-looking retrieval of information associated with previous words from working memory. Recent work has shown that Transformer-based language models successfully generate predictions about sentence processing load in human psycho- and neurolinguistic data by operationalizing disambiguation cost as next-token surprisal, and memory retrieval cost as normalized attention entropy (NAE). Such models, however, remain difficult to interpret, as it is not well understood which factors causally determine the cost value assigned to a given word in such artificial neural networks. Here, we present interpretable and cognitively grounded models of disambiguation and memory retrieval and evaluate their neural alignment and spatio-temporal correlates using human magnetoencephalography responses to naturalistic narrative speech. Multivariate temporal response function modeling first demonstrates that these human-bias-informed models account for observed human language processing data as well as their Transformer counterparts do. This same modeling framework then suggests that surprisal and NAE temporally dissociate in the cortical language network: surprisal is predictive of bilateral superior temporal gyrus and supramarginal gyrus activation at ~300-500 ms, and NAE is predictive of activity in the same regions, but later, at ~750-850 ms. By demonstrating that interpretable neurocomputational models can achieve meaningful brain alignment while maintaining explanatory transparency, this work offers a methodological blueprint for bridging the gap between algorithmic theory and neural implementation.
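
For readers unfamiliar with the two cost measures named above, the following sketch shows one common way to compute them. Surprisal is the negative log probability of a word given its context. For NAE, this example assumes Shannon entropy of the attention weights over previous tokens divided by its maximum, log(n); the exact normalization used in the work the abstract refers to may differ, and the function names are illustrative.

```python
import math


def surprisal(p_next):
    """Surprisal in bits of a word whose conditional probability is p_next."""
    return -math.log2(p_next)


def normalized_attention_entropy(weights):
    """Entropy of attention weights over n previous tokens, scaled to [0, 1].

    Assumed normalization: divide by log(n), the entropy of uniform attention.
    """
    n = len(weights)
    if n < 2:
        return 0.0  # a single attended token carries no uncertainty
    total = sum(weights)
    probs = [w / total for w in weights]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(n)
```

Under this definition, attention spread evenly over all previous tokens gives a value of 1, and attention focused on a single token gives 0.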

19

A global metagenomic atlas uncovers ubiquitous biosynthetic potential linked to adaptation in extreme environments

Du, R.; He, R.; Qi, Q.; Li, Z.; Tang, Q.; Zhang, Z.; Xu, X.; Peng, H.; Liu, J.; Medema, M. H.; Xu, Q.

2026-04-20 microbiology 10.64898/2026.04.17.719132 medRxiv

Top 3%

0.6%

Extreme environments impose severe physicochemical stresses that drive microorganisms to evolve specialized survival strategies. Microbial secondary metabolites determined by biosynthetic gene clusters (BGCs) are recognized as important mediators of microbial adaptation to environmental stress. However, their ecological roles, particularly habitat-dependent preferences across different environments, remain poorly understood. Although extreme environments provide opportunities to mine microbiomes for unique adaptations, such research is hampered by the lack of a systematic overview of their genomic diversity, BGC diversity, and the relationships between them. Here, we constructed a standardized extremophilic genomic catalogue (SEGC) from 1,462 metagenomic samples spanning seven representative extreme habitats. The catalogue comprised 54,661 metagenome-assembled genomes (MAGs) representing 21,805 species, 66.1% of which were previously uncharacterized. With this catalogue, we identified 162,855 BGCs distributed across 81.5% of MAGs. Gene cluster family analysis showed strong habitat dependence, largely explained by species-level habitat specificity. Terpene biosynthetic pathways illustrated habitat-linked adaptive strategies, with hopan-22-ol biosynthesis enriched in acid mine, deep sea and hydrothermal plume environments, while retinal-based phototrophy predominated in cryosphere and saline-alkaline habitats. Metatranscriptomic analyses supported in situ activity of these pathways. In conclusion, we presented a global atlas of biosynthetic potential across extreme-environment microbiota and revealed habitat-dependent patterns of secondary metabolism linked to microbial survival.

20

Mistake gating leads to energy and memory efficient continual learning

Pache, A.; van Rossum, M. C. W.

2026-04-20 neuroscience 10.64898/2026.04.16.718919 medRxiv

Top 3%

0.6%

Synaptic plasticity is metabolically expensive, yet animals continuously update their internal models without exhausting energy reserves. However, when artificial neural networks are trained, the network parameters are typically updated on every sample that is presented, even if the sample was classified correctly. Inspired by the human negativity bias and error-related negativity, we propose memorized mistake-gated learning, a biologically plausible plasticity rule where synaptic updates are strictly gated by current and past classification errors. This reduces the number of updates the network needs to make by 50-80%. Mistake gating is particularly well suited in two cases: 1) for incremental learning, where new knowledge is acquired on a background of pre-existing knowledge; 2) for online learning scenarios in which data needs to be stored for later replay, as mistake gating reduces storage buffer requirements. The algorithm can be implemented in a few lines of code, adds no hyper-parameters, and comes at negligible computational overhead. Learning on mistakes is an energy-efficient and biologically relevant modification to commonly used learning rules that is well suited for continual learning.
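
To make the gating idea concrete, here is a minimal sketch on a toy logistic classifier. It is not the authors' implementation. It assumes one reading of "gated by current and past classification errors": a sample triggers a weight update only if it is misclassified now or has been misclassified at some earlier presentation. The names `train`, `flagged`, and `make_toy_data` are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_prob(w, x):
    # w holds one weight per feature plus a trailing bias term.
    return sigmoid(w[-1] + sum(wi * xi for wi, xi in zip(w, x)))


def train(data, n_features, n_epochs=20, lr=0.1, gate=True, seed=0):
    """Online logistic regression; returns (weights, number_of_updates)."""
    rng = random.Random(seed)
    w = [0.0] * (n_features + 1)
    flagged = set()  # indices of samples misclassified at least once
    updates = 0
    order = list(range(len(data)))
    for _ in range(n_epochs):
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            p = predict_prob(w, x)
            if (p >= 0.5) != (y == 1):
                flagged.add(i)  # remember this mistake
            if gate and i not in flagged:
                continue  # correct now and never wrong before: skip update
            grad = p - y  # gradient of the log loss w.r.t. the logit
            for j in range(n_features):
                w[j] -= lr * grad * x[j]
            w[-1] -= lr * grad
            updates += 1
    return w, updates


def accuracy(w, data):
    hits = sum((predict_prob(w, x) >= 0.5) == (y == 1) for x, y in data)
    return hits / len(data)


def make_toy_data(n_per_class=20, seed=1):
    # Two well-separated 2-D clusters (assumed toy data, not from the paper).
    rng = random.Random(seed)
    def jitter():
        return rng.uniform(-0.5, 0.5)
    pos = [([2 + jitter(), 2 + jitter()], 1) for _ in range(n_per_class)]
    neg = [([-2 + jitter(), -2 + jitter()], 0) for _ in range(n_per_class)]
    return pos + neg
```

Calling `train(data, 2, gate=True)` and `train(data, 2, gate=False)` on the same data lets the two update counts be compared directly. On such an easy toy problem the saving is far larger than on realistic tasks, so the numbers should not be read as reproducing the reduction reported in the abstract.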